Reasoning about modular datatypes with Mendler induction 


Paolo Torrini Tom Schrijvers 

Department of Computer Science, KU Leuven, Belgium 
{p.torrini, tom.schrijvers}@cs.kuleuven.be 


In functional programming, datatypes à la carte provide a convenient modular representation of
recursive datatypes, based on their initial algebra semantics. Unfortunately it is highly challenging to
implement this technique in proof assistants that are based on type theory, like Coq. The reason is
that it involves type definitions, such as those of type-level fixpoint operators, that are not strictly
positive. The known work-around of impredicative encodings is problematic, insofar as it impedes
conventional inductive reasoning. Weak induction principles can be used instead, but they
considerably complicate proofs.

This paper proposes a novel and simpler technique to reason inductively about impredicative 
encodings, based on Mendler-style induction. This technique involves dispensing with dependent 
induction, ensuring that datatypes can be lifted to predicates and relying on relational formulations. 
A case study on proving subject reduction for structural operational semantics illustrates that the 
approach enables modular proofs, and that these proofs are essentially similar to conventional ones. 


1 Introduction 

Developing high-quality software artifacts, including programs as well as programming languages, can
be very expensive, and so can formally proving their properties. This makes it highly desirable to
maximise reuse and extensibility. Modularity plays an essential role in this context: a component is modular
whenever it can be specified independently of the whole collection - therefore, a modular
characterisation of an artifact implies that its extension does not require changes to what is already in stock.

In functional programming, it is natural to rely on a structured characterisation of components based
on recursive datatypes. However, conventional datatypes are not extensible - each one fixes a closed set
of constructors with respect to which case analysis may have to be exhaustive, hence each case implicitly
depends on the whole collection. An elegant solution to this tension between structural
characterisation and modularity, also known as the expression problem, has been found with the notion of modular
datatype (MDT) - i.e., datatypes à la carte, introduced in Haskell by Swierstra [16]. The definition of an
MDT consists of two distinct parts: the grammar, as a non-recursive structure based on a functor, and the
recursive datatype, as the recursive closure of the functor by a type-level fixed point. Grammar functors
behave as modules, as they can be defined independently and combined together by coproduct.

In Haskell, an MDT can be easily implemented in terms of conventional datatypes, which can be
used to define the grammar as well as the recursive closure (as recalled in Section 2). However, Haskell's
datatype definition of the type-level fixpoint operator is not strictly positive, and therefore it is
problematic from the point of view of less liberal type systems. As a general-purpose programming language,
Haskell relies on types that do not enforce totality (i.e., either termination or productivity). This makes
type checking easier in the presence of non-termination. Unfortunately, allowing for non-total programs
can lead to inconsistency under a program-as-proof interpretation. For this reason, proof assistants based
on the Curry-Howard correspondence are usually based on more restrictive type systems. Proof
assistants such as Coq, Agda, Isabelle and Twelf, for instance, rely on a syntactic criterion of monotonicity
which ensures totality, by requiring that all the occurrences of an inductive datatype in its definition are
strictly positive - a requirement that is incompatible with the Haskell-style representation of MDTs.

Coq is a theorem prover based on the calculus of inductive constructions (CIC) [?], which extends
the calculus of constructions (CC) [?] with inductive and coinductive definitions. CC, the most
expressive system of the lambda cube [2], allows for types depending on terms, type-level functions and full
parametric polymorphism, hence also for definitions that are impredicative, in the sense of referring in
their bodies to collections that are being defined. One of the main approaches to represent MDTs in Coq,
due to Delaware, Oliveira and Schrijvers [7] and implemented in the MTC/3MT framework [6], takes
advantage of impredicativity, and relies on the Church encoding of fixed points (as recalled in Section 3).
Another promising approach, due to Keuchel and Schrijvers [11], relies on containers - it is predicative,
but it involves a more indirect representation of types. Church encodings are purely based on CC and
do not involve any extra-logical machinery - however, they rather complicate inductive reasoning.
Impredicative definitions have an eliminative character that hides term structure, hence making it harder to
reason by induction. The solution proposed by Delaware et al. is quite general - however, it relies on
proof algebras that pack terms together with proofs using $\Sigma$-types, and this leads to inductive proofs that
have a significant overhead with respect to the conventional, non-modular ones.

This paper proposes a novel solution to the problem of reasoning inductively with impredicatively
encoded MDTs, based on the use of Mendler-style induction [?]. Mendler's characterisation of
iteration makes it possible to encode an induction principle within the impredicative encoding of an
MDT. Unlike Delaware et al., we use Mendler algebras as proof algebras. This leads to inductive proofs
that are straightforwardly modular and ultimately closer to conventional ones (Section 4). Although
this approach cannot handle dependent induction, this limitation is of little consequence as long as we
are reasoning about relational formulations. Nonetheless, this may make it necessary to lift inductive
datatypes to inductively defined predicates, in order to use them as inductive arguments in proofs.

In order to reason inductively on relations, we clearly need to rely on functor shapes that can
represent them as well as mutual dependencies. Such need is highlighted throughout a case study on the
formalisation of a language based on structural operational semantics (Section 5, Coq implementation
available [17]). The language, for which we prove type preservation, has a definition that involves mutual
dependency between expressions and declarations.


2 Datatypes a-la-carte 

MDTs as introduced by Swierstra [16] are essentially a functional programming application of the initial
algebra semantics of inductive types. This consists of associating an inductive datatype to an endofunctor
in a base category, then interpreting it as the initial object in the category of algebras determined by the
functor [?].

In its simplest form, taking sets ($\mathsf{S}$) as the base category, each inductive datatype $\rho : \mathsf{S}$ can be
associated with a covariant endofunctor (signature functor), i.e. a map $F : \mathsf{S} \to \mathsf{S}$ for which there exists a map
(functor map) $\mathsf{fmap}_F : \forall \{A\ B\}.\ (A \to B) \to (F\,A \to F\,B)$ that preserves identities and composition, with
$A, B : \mathsf{S}$ (always treated as implicit parameters). Semantically, an algebra determined by $F$ ($F$-algebra) is
a pair $(C, \phi)$ where $C : \mathsf{S}$ is the carrier and $\phi : F\,C \to C$ is the structure map. $F\,C$ can be understood as the
denotation of a grammar based on signature $F$, given carrier $C$. The initial object $(\mu F, \mathsf{in}_F)$, where $\mathsf{in}_F$ is
an isomorphism and thus has an inverse $\mathsf{out}_F$, gives the denotation of $\rho$ obtained as the fixpoint closure
of $F$. In this way, the non-recursive structural characterisation of $\rho$, which essentially corresponds to
case analysis, is separated from its recursive closure. For instance, in a functional language which allows
for datatype definitions with data constructors and Haskell-style destructors (while we mainly rely on
Coq-style and standard algebraic notation), the following

$$\mathsf{dt\_def}\ \rho = c_1\,(\tau_1[\rho/A]) \mid \dots \mid c_k\,(\tau_k[\rho/A]) \tag{1}$$

can be decomposed in

$$\mathsf{dt\_def}\ F\,A = c_1\,(\tau_1) \mid \dots \mid c_k\,(\tau_k) \tag{2}$$

and

$$\rho =_{df} \mathsf{Fix}\,F \tag{3}$$

where $\mathsf{Fix}\,F$ is the syntactic representation of $\mu F$, i.e.

$$\mathsf{dt\_def}\ \mathsf{Fix}\,F = \mathsf{in}\,(\mathsf{out} : F\,(\mathsf{Fix}\,F)) \tag{4}$$

For each $F$-algebra $(C, f)$, the unique incoming algebra morphism from the initial algebra is determined
by the unique mediating map $\mathsf{fold}_{F,C}\,f : \mu F \to C$. Syntactically, this corresponds to the definition of
$\mathsf{fold}\,F\,C : (F\,C \to C) \to (\mathsf{Fix}\,F \to C)$ as a recursive function.

$$\mathsf{fold}\,F\,C\,f\,x =_{df} f\,(\mathsf{fmap}\,F\,(\mathsf{fold}\,F\,C\,f)\,(\mathsf{out}\,x)) \tag{5}$$

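Functors are composable by coproduct ($+$), i.e., if $F_1, F_2 : \mathsf{S} \to \mathsf{S}$ are functors, so is $F_1 + F_2$, with

$$\mathsf{dt\_def}\ (F_1 + F_2)\,C = \mathsf{inl}\,(F_1\,C) \mid \mathsf{inr}\,(F_2\,C) \tag{6}$$

This results in a modular definition of the inductive datatype $\mathsf{Fix}\,(F_1 + F_2)$ - not to be confused with
$\mathsf{Fix}\,F_1 + \mathsf{Fix}\,F_2$. In connection with coproducts, Haskell implementations of MDTs rely on type classes to
automate injections and projections, using smart constructors and class constraints to express
subsumption between functors. As a concrete example, following Swierstra [16], the conventional datatype

$$\mathsf{dt\_def}\ \mathsf{Trm} = \mathsf{lit}\,(\mathsf{Int}) \mid \mathsf{add}\,(\mathsf{Trm} * \mathsf{Trm}) \tag{7}$$

can be decomposed into two modules

$$\mathsf{dt\_def}\ \mathsf{TrmG}_1\,C = \mathsf{lit}\,(\mathsf{Int}) \qquad \mathsf{dt\_def}\ \mathsf{TrmG}_2\,C = \mathsf{add}\,(C * C) \tag{8}$$

and thus modularly defined:

$$\mathsf{TrmG} =_{df} \mathsf{TrmG}_1 + \mathsf{TrmG}_2 \qquad \mathsf{Trm} =_{df} \mathsf{Fix}\,\mathsf{TrmG} \tag{9}$$

Moreover, given a notion of value and a conventional recursive definition of evaluation

$$\begin{aligned}
&\mathsf{dt\_def}\ \mathsf{Val} = \mathsf{val}\,(\mathsf{vv} : \mathsf{Int}) \qquad \mathsf{eval} : \mathsf{Trm} \to \mathsf{Val}\\
&\mathsf{eval}\,(\mathsf{lit}\,x) =_{df} \mathsf{val}\,x\\
&\mathsf{eval}\,(\mathsf{add}\,(e_1, e_2)) =_{df} \mathsf{val}\,((\mathsf{vv} \circ \mathsf{eval}\ e_1) + (\mathsf{vv} \circ \mathsf{eval}\ e_2))
\end{aligned} \tag{10}$$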

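the latter can be represented by an algebra and modularly decomposed as follows, allowing for a modular
definition of the dynamic semantics.

$$\begin{aligned}
&\mathsf{evalG}_1 : \mathsf{TrmG}_1\,\mathsf{Val} \to \mathsf{Val} & &\mathsf{evalG}_1\,(\mathsf{lit}\,x) =_{df} \mathsf{val}\,x\\
&\mathsf{evalG}_2 : \mathsf{TrmG}_2\,\mathsf{Val} \to \mathsf{Val} & &\mathsf{evalG}_2\,(\mathsf{add}\,(x_1, x_2)) =_{df} \mathsf{val}\,((\mathsf{vv}\,x_1) + (\mathsf{vv}\,x_2))
\end{aligned} \tag{11}$$

$$\begin{aligned}
&\mathsf{evalG} : \mathsf{TrmG}\,\mathsf{Val} \to \mathsf{Val}\\
&\mathsf{evalG}\,(\mathsf{inl}\,e) =_{df} \mathsf{evalG}_1\,e\\
&\mathsf{evalG}\,(\mathsf{inr}\,e) =_{df} \mathsf{evalG}_2\,e\\
&\mathsf{eval}\,e =_{df} \mathsf{fold}\,\mathsf{TrmG}\,\mathsf{Val}\,\mathsf{evalG}\,e
\end{aligned} \tag{12}$$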
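
The decomposition in equations (4) to (12) can be transcribed almost literally into Haskell. The following is only a sketch under stated assumptions: the operator `:+:`, the constructor names `In`, `Inl`, `Inr`, `Lit`, `Add` and the helpers `lit` and `add` are illustrative choices that do not come from the text, and the type-class machinery for automated injections is left out.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Recursive closure of a functor, cf. equation (4).
newtype Fix f = In { out :: f (Fix f) }

-- Coproduct of functors, cf. equation (6).
data (f :+: g) a = Inl (f a) | Inr (g a)
infixr 6 :+:

instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl x) = Inl (fmap h x)
  fmap h (Inr y) = Inr (fmap h y)

-- Generic fold, cf. equation (5).
fold :: Functor f => (f c -> c) -> Fix f -> c
fold alg = alg . fmap (fold alg) . out

-- The two grammar modules of equation (8).
newtype TrmG1 c = Lit Int
data    TrmG2 c = Add c c

instance Functor TrmG1 where
  fmap _ (Lit n) = Lit n

instance Functor TrmG2 where
  fmap h (Add a b) = Add (h a) (h b)

-- Modular term type, cf. equation (9).
type TrmG = TrmG1 :+: TrmG2
type Trm  = Fix TrmG

newtype Val = Val { vv :: Int } deriving (Eq, Show)

-- Evaluation algebras, one per module, cf. equation (11).
evalG1 :: TrmG1 Val -> Val
evalG1 (Lit n) = Val n

evalG2 :: TrmG2 Val -> Val
evalG2 (Add x y) = Val (vv x + vv y)

-- Combined algebra and evaluation as a fold, cf. equation (12).
evalG :: TrmG Val -> Val
evalG (Inl e) = evalG1 e
evalG (Inr e) = evalG2 e

eval :: Trm -> Val
eval = fold evalG

-- Hypothetical helper constructors (hand-written injections).
lit :: Int -> Trm
lit = In . Inl . Lit

add :: Trm -> Trm -> Trm
add a b = In (Inr (Add a b))
```

Loaded in GHCi, `eval (add (lit 1) (add (lit 2) (lit 3)))` evaluates to `Val {vv = 6}`. The definition of `Fix` is exactly the one that is not strictly positive, which is why this transcription cannot be carried over to Coq as it stands.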

3 Impredicative encoding

The MDT representation discussed so far works well with Haskell, but not with Coq. Representing F as 
an inductive datatype is not problematic, but this is not so for the fixpoint closure. Since the constructor 
of $\mathsf{Fix}\,F$ has type $F\,(\mathsf{Fix}\,F) \to \mathsf{Fix}\,F$, the datatype has a non-strictly positive occurrence in its definition,
as parameter of the argument type - hence it is rejected by Coq. There is an analogous issue with the
definition of fold, which is not structurally recursive. The solution to this problem adopted by Delaware
et al. in [7], which we summarise here, goes back to Pfenning and Paulin-Mohring [13] in relying on a
Church-style encoding of fixpoint operators, thus requiring impredicative definitions. 

From the point of view of a type theoretic representation, the type of an algebra (that we may call
Church algebra, or conventional algebra) can be identified with the type of its structure map.

$$\mathsf{Alg}^{C}\,F\,C =_{df} F\,C \to C \tag{13}$$

If the initiality property of fixed points is weakened to an existence property, a fixpoint operator can be
regarded as a function that maps an algebra to its carrier. An abstract definition of the type-level fixpoint
operator $\mathsf{Fix}^{C} : (\mathsf{S} \to \mathsf{S}) \to \mathsf{S}$ can then be given, as elimination rule for $F$-algebras, impredicatively with
respect to $\mathsf{S}$ (this requires the impredicative set option in Coq, as used in MTC/3MT [7]).

$$\mathsf{Fix}^{C}\,F =_{df} \forall A : \mathsf{S}.\ \mathsf{Alg}^{C}\,F\,A \to A \tag{14}$$

The map $\mathsf{fold}^{C}\,F\,C : \mathsf{Alg}^{C}\,F\,C \to \mathsf{Fix}^{C}\,F \to C$, corresponding to the elimination of a fixpoint value, can
now be defined as the application of that value.

$$\mathsf{fold}^{C}\,F\,C\,f\,x =_{df} x\,C\,f \tag{15}$$

Relying on the functoriality of $F$, the in-map $\mathsf{in}^{C}\,F : F\,(\mathsf{Fix}^{C}\,F) \to \mathsf{Fix}^{C}\,F$ and the out-map
$\mathsf{out}^{C}\,F : \mathsf{Fix}^{C}\,F \to F\,(\mathsf{Fix}^{C}\,F)$ can be defined as functions.

$$\mathsf{in}^{C}\,F =_{df} \lambda x\,A\,f.\ f\,(\mathsf{fmap}\,F\,(\mathsf{fold}^{C}\,F\,A\,f)\,x) \tag{16}$$

$$\mathsf{out}^{C}\,F =_{df} \mathsf{fold}^{C}\,F\,(F\,(\mathsf{Fix}^{C}\,F))\,(\mathsf{fmap}\,F\,(\mathsf{in}^{C}\,F)) \tag{17}$$

Notice that the definition of $\mathsf{fold}^{C}\,F\,C\,f$ does not guarantee the uniqueness of the mediating map - it
rather corresponds to a condition called quasi-initiality by Wadler [19]. In order to obtain uniqueness,
hence to ensure that $\mathsf{in}^{C}$ is an isomorphism, the following implication needs to be proved for $F$ [?].

$$(\forall x : F\,(\mathsf{Fix}^{C}\,F).\ h\,(\mathsf{in}^{C}\,F\,x) = f\,(\mathsf{fmap}\,F\,h\,x)) \Rightarrow (h = \mathsf{fold}^{C}\,F\,C\,f) \tag{18}$$

Semantically, the impredicative encoding of the fixed points is closely associated with a constructor,
usually called build, that allows for an alternative interpretation of inductive datatypes in terms of limit
constructions, provably equivalent to the initial algebra semantics [?].
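
Equations (13) to (17) can be mirrored in Haskell with rank-2 polymorphism, which plays the role of the impredicative quantification over $\mathsf{S}$. This is a sketch, not the Coq development: the names `AlgC`, `FixC`, `foldC`, `inC`, `outC` are chosen here to echo the notation above, and the functor `NatF` with its helpers is a hypothetical example that does not appear in the text.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Conventional (Church) algebra, cf. equation (13).
type AlgC f c = f c -> c

-- Impredicative fixpoint: a value is its own eliminator, cf. equation (14).
newtype FixC f = FixC { unFixC :: forall a. AlgC f a -> a }

-- Folding is just application, cf. equation (15).
foldC :: AlgC f c -> FixC f -> c
foldC alg x = unFixC x alg

-- In-map, cf. equation (16). It is a function, not a constructor.
inC :: Functor f => f (FixC f) -> FixC f
inC x = FixC (\alg -> alg (fmap (foldC alg) x))

-- Out-map, cf. equation (17).
outC :: Functor f => FixC f -> f (FixC f)
outC = foldC (fmap inC)

-- Hypothetical example functor: unary natural numbers.
data NatF a = Z | S a

instance Functor NatF where
  fmap _ Z     = Z
  fmap h (S n) = S (h n)

toInt :: FixC NatF -> Int
toInt = foldC alg
  where
    alg Z     = 0
    alg (S n) = n + 1

two :: FixC NatF
two = inC (S (inC (S (inC Z))))
```

Here `toInt two` is 2, and `toInt (inC (outC two))` is 2 as well. The second fact is only an observation about one value: that `inC` and `outC` are mutually inverse in general is exactly what property (18) is needed for.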

3.1 Indexed algebras 

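A relation can be represented as a function from the type of its tupled arguments to the type $\mathsf{P}$ of
propositions. From the point of view of initial semantics, assuming $\mathsf{P}$ can be represented as a category, the
modular representation of inductively defined relations only requires a shift of base category. Given a
type $K$ (i.e., $K : \mathsf{Type}$) and assuming it can be represented as a small category, we can take the category
of diagrams of type $K$ in $\mathsf{P}$ as the base category for the relations of type $K \to \mathsf{P}$. In such category, an
endofunctor $R : (K \to \mathsf{P}) \to (K \to \mathsf{P})$ that here we call indexed functor, is then associated with a map
$\mathsf{fmap}^{I}$ (indexed functor map) that preserves identities and composition.

$$\mathsf{fmap}^{I}\,K\,R : \forall \{A\ B : K \to \mathsf{P}\}.\ (\forall w : K.\ A\,w \to B\,w) \to (\forall w : K.\ R\,A\,w \to R\,B\,w) \tag{19}$$

From the point of view of the impredicative encoding, an $R$-algebra can be characterised as an indexed
map, given a carrier $D : K \to \mathsf{P}$.

$$\mathsf{Alg}^{CI}\,K\,R\,D =_{df} \forall w : K.\ R\,D\,w \to D\,w \tag{20}$$

The corresponding fixpoint operator has type $((K \to \mathsf{P}) \to K \to \mathsf{P}) \to K \to \mathsf{P}$.

$$\mathsf{Fix}^{CI}\,K\,R\,(w : K) =_{df} \forall A : K \to \mathsf{P}.\ \mathsf{Alg}^{CI}\,K\,R\,A \to A\,w \tag{21}$$

The structuring operators can be defined as follows:

$$\mathsf{fold}^{CI}\,K\,R : \forall A\ (f : \mathsf{Alg}^{CI}\,K\,R\,A)\ (w : K).\ \mathsf{Fix}^{CI}\,K\,R\,w \to A\,w =_{df} \lambda A\,f\,w\,e.\ e\,A\,f \tag{22}$$

$$\begin{aligned}
&\mathsf{in}^{CI}\,K\,R\,(w : K) : R\,(\mathsf{Fix}^{CI}\,K\,R)\,w \to \mathsf{Fix}^{CI}\,K\,R\,w =_{df}\\
&\qquad \lambda x\,A\,f.\ f\,w\,(\mathsf{fmap}^{I}\,K\,R\,(\mathsf{fold}^{CI}\,K\,R\,A\,f)\,w\,x)
\end{aligned} \tag{23}$$

$$\begin{aligned}
&\mathsf{out}^{CI}\,K\,R\,(w : K) : \mathsf{Fix}^{CI}\,K\,R\,w \to R\,(\mathsf{Fix}^{CI}\,K\,R)\,w =_{df}\\
&\qquad \mathsf{fold}^{CI}\,K\,R\,(R\,(\mathsf{Fix}^{CI}\,K\,R))\,(\mathsf{fmap}^{I}\,K\,R\,(\mathsf{in}^{CI}\,K\,R))\,w
\end{aligned} \tag{24}$$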


3.2 Proof algebras 

The impredicative encoding makes it comparatively easy to represent MDTs in Coq, but leaves us with 
the problem of how to reason inductively about them. Unlike the in-map of the categorical semantics, 
$\mathsf{in}^{C}$ is not a constructor - therefore, structural induction cannot be applied to a term of type $\mathsf{Fix}^{C}\,F$. Let
$P : T \to \mathsf{P}$ be a property and $T$ the representation of an inductive datatype in the following goal, which
we assume to be semantically provable by induction on $T$.

$$\Gamma,\ w : T \vdash g : P\,w \tag{25}$$

However, given $T =_{df} \mathsf{Fix}^{C}\,F$ and the impredicative definition of $\mathsf{Fix}^{C}$, the type $T$ is not syntactically
inductive, and no conventional induction principle can be applied. Nevertheless, we can prove

$$\forall v : T.\ \exists w : F\,T.\ P\,v = P\,(\mathsf{in}^{C}\,F\,w) \tag{26}$$

as this follows from the equality $v = \mathsf{in}^{C}\,F\,(\mathsf{out}^{C}\,F\,v)$ which can be proved, provided $\mathsf{in}^{C}\,F$ is shown
to be an isomorphism - e.g., by proving (18). Rewriting (25) with (26), we obtain

$$\Gamma,\ w : F\,T \vdash g' : P\,(\mathsf{in}^{C}\,F\,w) \tag{27}$$

Here it is possible to apply induction on $w$, since $F\,T$ is an inductive datatype: however, what we actually
get is case analysis - the recursive arguments in $F\,T$ are hidden in the same sense as before, as they have
type $T$ rather than $F\,T$.

The solution adopted by Delaware et al. in [7], implemented in Coq and supported by MTC/3MT,
consists of packing an existential copy of the inductive term together with a proof that it satisfies the
property, using $\Sigma$ types. This involves replacing the conventional proof with one based on the representation
of the goal as an algebra, i.e., a proof algebra.

$$\Gamma \vdash f : \mathsf{Alg}^{C}\,F\,(\Sigma v.\ P\,v) \tag{28}$$

By folding such an algebra, one obtains

$$\Gamma,\ w : T \vdash \mathsf{fold}^{C}\,F\,(\Sigma v.\ P\,v)\,f\,w : \Sigma v.\ P\,v \tag{29}$$

which states something weaker than the original goal (25). Nonetheless, under conditions associated
with well-formed proof algebras in [7], (28) can be strengthened to (25). This technique is quite general,
and it can be applied to inductive proofs in which the goals may depend on the inductive argument (i.e., 
it can deal with dependent induction). However, the proofs that are obtained in this way are essentially 
factored into two non-trivial parts - the application of a weak induction principle and a well-formedness 
proof - and therefore are quite different from conventional inductive ones. 


3.3 Looking for a simpler solution 

A natural question arises: is it possible to sacrifice some of the generality of the MTC approach, to obtain
proofs that look more familiar? The whole point of using $\Sigma$ types is to hide dependencies: a solution
that does not involve them, and so a positive answer to our question, appears more feasible when we can
dispense with the use of dependent induction, by finding an alternative, equivalent formulation of the
goal. In our schematic example (25) we get such reformulation, when we can find $S, Q : T \to \mathsf{P}$ and an
indexed functor $R : (T \to \mathsf{P}) \to T \to \mathsf{P}$ such that $S =_{df} \mathsf{Fix}^{CI}\,T\,R$, the following equivalence holds

$$\begin{aligned}
&\text{there exists } t \text{ s.t. } \Gamma \vdash t : \forall w : T.\ S\,w \to Q\,w\\
&\qquad\text{iff}\\
&\text{there exists } t' \text{ s.t. } \Gamma \vdash t' : \forall w : T.\ P\,w
\end{aligned} \tag{30}$$

and the following is semantically provable, as the new goal, by induction on $h$:

$$\Gamma,\ w : T,\ h : S\,w \vdash l : Q\,w \tag{31}$$

Intuitively, this means that the dependency of the proof on $w$ can be lifted to a type dependency, given a
sufficiently close analogy between $T$ as modular inductive datatype and $S$ as modular inductive predicate,
therefore by rather using $h$ of type $S\,w$ as inductive argument. Again, we need to expose the inductive
structure by shifting to

$$\Gamma,\ w : T,\ h : R\,(\mathsf{Fix}^{CI}\,T\,R)\,w \vdash l' : Q\,w \tag{32}$$

and this is not problematic. However, as before, we end up stuck with case analysis rather than proper
induction. In order to solve this problem, we need to look at an alternative encoding of fixed points, based
on Mendler-style induction [?]. In fact, Mendler's approach makes it possible to build induction
principles into impredicatively encoded fixed points. Notice that Mendler algebras are used by Delaware
et al. [7], but have a different purpose there (i.e., controlling the order of evaluation), from the one we
are proposing here.


4 Mendler algebras

We first present the Mendler-style semantics of inductive datatypes by introducing Mendler algebras as
a category, following Uustalu and Vene [18]. Given a covariant functor $F : \mathsf{S} \to \mathsf{S}$, a Mendler algebra is
a pair $(C, \Psi)$ where $C : \mathsf{S}$ is the carrier and $\Psi\,A : (A \to C) \to (F\,A \to C)$, for each $A : \mathsf{S}$, is a map from
morphisms to morphisms satisfying $\Psi\,A\,f = (\Psi\,C\,\mathsf{id}_C) \circ (\mathsf{fmap}\,F\,f)$, with $f$ a morphism from $A$ to $C$.
A morphism between Mendler algebras $(C_1, \Psi_1)$ and $(C_2, \Psi_2)$ is a morphism $h : C_1 \to C_2$ that satisfies
$h \circ \Psi_1\,C_1\,\mathsf{id}_{C_1} = \Psi_2\,C_1\,h$. The Mendler algebra semantics has been proved equivalent to the conventional
one by Uustalu et al. Assume $F$ such that the conventional initial $F$-algebra $(\mu F, \mathsf{in}_F)$ exists. Given the
abbreviation

$$\mathsf{pre\_in}_F\,C\,(m : C \to \mu F) =_{df} \mathsf{in}_F \circ (\mathsf{fmap}\,F\,m) : (F\,C \to \mu F) \tag{33}$$

we can prove the equation

$$\mathsf{in}_F = \mathsf{pre\_in}_F\,\mu F\,\mathsf{id} \tag{34}$$

by the isomorphic character of $\mathsf{in}_F$. The Mendler algebra $(\mu F, \mathsf{pre\_in}_F)$ can thus be shown to be the
initial object in its category, and therefore used as alternative interpretation of the inductive datatype
associated with $F$. For each Mendler algebra $(C, \Psi)$, the unique incoming morphism from the initial
Mendler $F$-algebra can be defined

$$\mathsf{mfold}\,F\,C\,\Psi\,x =_{df} \Psi\,(\mu F)\,(\mathsf{mfold}\,F\,C\,\Psi)\,(\mathsf{out}_F\,x) \tag{35}$$

Unlike the conventional fixpoint operator, the Mendler one can be encoded in Coq as an inductive
datatype (though using the impredicative option).

$$\mathsf{dt\_def}\ \mathsf{MFix}\,F = \mathsf{pre\_in}\ (C : \mathsf{S})\ (r : C \to \mathsf{MFix}\,F)\ (c : F\,C) \tag{36}$$

However $\mathsf{in}$, as defined by equation (34) in this setting, is still not a constructor, and the definition of
mfold is not structurally recursive. Therefore, also in this case, it seems more convenient to resort to an
impredicative encoding, following [?].


4.1 Impredicative Mendler algebra encoding 

Mendler algebras can be characterised impredicatively by the type of their structure maps, and a fixpoint
operator can be defined as in the conventional case [?].

$$\mathsf{Alg}^{M}\,F\,C =_{df} \forall A.\ (A \to C) \to (F\,A \to C) \tag{37}$$

$$\mathsf{Fix}^{M}\,F =_{df} \forall C.\ \mathsf{Alg}^{M}\,F\,C \to C \tag{38}$$

Unlike the conventional case, the type of a Mendler algebra can be read as specification of an iteration
step, where the bound type variable $A$ represents the type of the recursive calls. The corresponding fold
operator

$$\mathsf{fold}^{M}\,F\,C\,f\,x =_{df} x\,C\,f \tag{39}$$

indeed has type

$$\mathsf{fold}^{M}\,F\,C : (\forall A.\ (A \to C) \to (F\,A \to C)) \to (\mathsf{Fix}^{M}\,F) \to C \tag{40}$$

which can represent an induction principle, under the assumption that the argument to the induction
hypothesis is only used therein without further analysis [?]. In-maps and out-maps can be defined as
follows

$$\mathsf{in}^{M}\,F\,(x : F\,(\mathsf{Fix}^{M}\,F)) : \mathsf{Fix}^{M}\,F =_{df} \lambda A\ (f : \mathsf{Alg}^{M}\,F\,A).\ f\,(\mathsf{Fix}^{M}\,F)\,(\mathsf{fold}^{M}\,F\,A\,f)\,x \tag{41}$$

$$\begin{aligned}
&\mathsf{out}^{M}\,F\,(x : \mathsf{Fix}^{M}\,F) : F\,(\mathsf{Fix}^{M}\,F) =_{df} x\,(F\,(\mathsf{Fix}^{M}\,F))\\
&\qquad (\lambda A\ (r : A \to F\,(\mathsf{Fix}^{M}\,F))\ (a : F\,A).\ \mathsf{fmap}\,F\,(\lambda y : A.\ \mathsf{in}^{M}\,F\,(r\,y))\,a)
\end{aligned} \tag{42}$$

As in the conventional case, impredicative fixpoint definitions give us quasi-initiality. The uniqueness
condition of $\mathsf{fold}^{M}\,F\,A\,f$ that is needed for initiality, in a way which parallels (18), is given by

$$(\forall x : F\,(\mathsf{Fix}^{M}\,F).\ h\,(\mathsf{in}^{M}\,F\,x) = f\,(\mathsf{Fix}^{M}\,F)\,h\,x) \Rightarrow h = \mathsf{fold}^{M}\,F\,A\,f \tag{43}$$

to be proven for a fixed $F$, for every $A : \mathsf{S}$, $f : \mathsf{Alg}^{M}\,F\,A$ and $h : \mathsf{Fix}^{M}\,F \to A$ [18].
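
As an informal illustration of equations (37) to (42), the Mendler encoding can be written in Haskell with rank-2 types. This is a sketch under stated assumptions: the names `AlgM`, `FixM`, `foldMendler`, `inM`, `outM` are chosen here to follow the notation, and the functor `NatF` and algebra `toIntAlg` are hypothetical examples not taken from the text.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Mendler algebra: an iteration step that receives the recursive call
-- as a function on an abstract type a, cf. equation (37).
type AlgM f c = forall a. (a -> c) -> f a -> c

-- Impredicative Mendler fixpoint, cf. equation (38).
newtype FixM f = FixM { unFixM :: forall c. AlgM f c -> c }

-- Fold is application, cf. equations (39) and (40).
foldMendler :: AlgM f c -> FixM f -> c
foldMendler alg x = unFixM x alg

-- In-map, cf. equation (41). No Functor constraint is needed.
inM :: f (FixM f) -> FixM f
inM x = FixM (\alg -> alg (foldMendler alg) x)

-- Out-map, cf. equation (42). Functoriality is needed only here.
outM :: Functor f => FixM f -> f (FixM f)
outM x = unFixM x (\r a -> fmap (inM . r) a)

-- Hypothetical example functor: unary natural numbers.
data NatF a = Z | S a

instance Functor NatF where
  fmap _ Z     = Z
  fmap h (S n) = S (h n)

-- The recursive position n has abstract type a, so the only thing
-- the step can do with it is pass it to the recursive call.
toIntAlg :: AlgM NatF Int
toIntAlg _   Z     = 0
toIntAlg rec (S n) = rec n + 1

three :: FixM NatF
three = inM (S (inM (S (inM (S (inM Z))))))
```

With these definitions `foldMendler toIntAlg three` is 3. The type of `toIntAlg` makes the restriction discussed above visible: since `n` has the abstract type `a`, it cannot be inspected, only handed to `rec`, which is the programming counterpart of an induction hypothesis whose argument is used without further analysis.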


4.2 Indexed Mendler algebras 

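As before, we need indexed algebras to deal with relations. The definitions are similar to the conventional
ones, with $K$ a type, $R : (K \to \mathsf{P}) \to (K \to \mathsf{P})$ an indexed functor, and $D : K \to \mathsf{P}$ an indexed carrier.

$$\mathsf{Alg}^{MI}\,K\,R\,D =_{df} \forall A.\ (\forall w : K.\ A\,w \to D\,w) \to \forall w : K.\ R\,A\,w \to D\,w \tag{44}$$

$$\mathsf{Fix}^{MI}\,K\,R\,w =_{df} \forall A.\ \mathsf{Alg}^{MI}\,K\,R\,A \to A\,w \tag{45}$$

$$\mathsf{fold}^{MI}\,K\,R\,D\,(f : \mathsf{Alg}^{MI}\,K\,R\,D)\,(w : K)\,(x : \mathsf{Fix}^{MI}\,K\,R\,w) =_{df} x\,D\,f \tag{46}$$

$$\begin{aligned}
&\mathsf{in}^{MI}\,K\,R\,(w : K)\,(x : R\,(\mathsf{Fix}^{MI}\,K\,R)\,w) : \mathsf{Fix}^{MI}\,K\,R\,w =_{df}\\
&\qquad \lambda A\ (f : \mathsf{Alg}^{MI}\,K\,R\,A).\ f\,(\mathsf{Fix}^{MI}\,K\,R)\,(\mathsf{fold}^{MI}\,K\,R\,A\,f)\,w\,x
\end{aligned} \tag{47}$$

$$\begin{aligned}
&\mathsf{out}^{MI}\,K\,R\,(w : K)\,(x : \mathsf{Fix}^{MI}\,K\,R\,w) : R\,(\mathsf{Fix}^{MI}\,K\,R)\,w =_{df}\\
&\qquad x\,(R\,(\mathsf{Fix}^{MI}\,K\,R))\,(\lambda A\ (r : \forall v.\ A\,v \to R\,(\mathsf{Fix}^{MI}\,K\,R)\,v)\\
&\qquad\quad (w : K)\ (a : R\,A\,w).\ \mathsf{fmap}^{I}\,R\,(\lambda y : A\,w.\ \mathsf{in}^{MI}\,K\,R\,w\,(r\,w\,y))\,a)
\end{aligned} \tag{48}$$

As an example, we can define inductively a relation $\mathsf{Eval} : (\mathsf{Trm} * \mathsf{Val}) \to \mathsf{P}$ that agrees with eval.

$$\begin{aligned}
&\mathsf{dt\_def}\ \mathsf{Evalc}\ (A : (\mathsf{Trm} * \mathsf{Val}) \to \mathsf{P}) : (\mathsf{Trm} * \mathsf{Val}) \to \mathsf{P} =\\
&\quad \mathsf{ev1} : \forall x : \mathsf{Int}.\ \mathsf{Evalc}\,A\,(\mathsf{lit}\,x, \mathsf{val}\,x)\\
&\quad \mathsf{ev2} : \forall e_1\,e_2 : \mathsf{Trm},\ x_1\,x_2 : \mathsf{Val}.\ A\,(e_1, x_1) \wedge A\,(e_2, x_2) \to\\
&\qquad\quad \mathsf{Evalc}\,A\,(\mathsf{add}\,(e_1, e_2), \mathsf{val}\,((\mathsf{vv}\,x_1) + (\mathsf{vv}\,x_2)))\\
&\mathsf{Eval} =_{df} \mathsf{Fix}^{MI}\,(\mathsf{Trm} * \mathsf{Val})\,\mathsf{Evalc}
\end{aligned} \tag{50}$$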


4.3 Proof algebras, Mendler-style

Reconsider the schematic example in Section 3.2: the problem in (32) was the missing induction
hypothesis, that cannot be obtained by appealing to the standard inductive principle, as the recursive occurrences
are wrapped in a non-inductive type. Intuitively, this can be fixed by giving such a hypothesis explicitly.
This would give us a generic representation of the step lemma in our inductive proof.

$$\Gamma,\ h_0 : \forall v : T.\ \mathsf{Fix}^{MI}\,T\,R\,v \to Q\,v,\ w : T,\ h_1 : R\,(\mathsf{Fix}^{MI}\,T\,R)\,w \vdash q : Q\,w \tag{51}$$

However, here the type of $h_0$ is actually too specific to be that of the induction hypothesis with respect to
$h_1$ - as a result, the sequent is too weak to take us to the main goal (31). At this point, Mendler's intuition
comes into play: under the assumption that the argument passed to the induction hypothesis is used only
there, without further case analysis, and that therefore we make no use of its type structure, its type can
be represented by a fresh type variable - the key feature of Mendler-style induction [?]. We can then
strengthen (51) to the following, more abstract goal.

$$\Gamma,\ A : \mathsf{Type},\ h_0 : \forall v : T.\ A\,v \to Q\,v,\ w : T,\ h_1 : R\,A\,w \vdash p : Q\,w \tag{52}$$

Given $f =_{df} \lambda A\,h_0\,w\,h_1.\ p$, the above is equivalent to

$$\Gamma \vdash f : \mathsf{Alg}^{MI}\,T\,R\,Q \tag{53}$$

Now we have an indexed Mendler algebra. The original goal, equivalent to (25) by a reformulation of
(30) with $S = \mathsf{Fix}^{MI}\,T\,R$, can then be obtained by folding, without need of further adjustments.

$$\Gamma \vdash \mathsf{fold}^{MI}\,T\,R\,Q\,f : \forall w : T.\ S\,w \to Q\,w \tag{54}$$

In order to prove (52), case analysis (as provided in Coq e.g. by inversion and destruct tactics [?])
can be applied to $h_1$, allowing us to reason on the structure of $R\,A\,w$. This actually results in doing
induction on that structure, as the induction hypothesis $h_0$ is already there. In this way, we can minimise
the overhead of combining inductive proofs with modular datatypes. Proving an inductive lemma boils
down to constructing the appropriate Mendler algebra - the rest is either conventional, or comes for free.
In connection with MDTs, such algebras can be regarded as proof modules, that can be composed together
in the usual sense of case analysis on coproducts in the same straightforward way as evaluation
algebras (the original motivating example by Swierstra [16]). This sounds attractive, from the point of
view of the applications in which the relational aspect is predominant, such as structural operational
semantics.
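
The claim that Mendler algebras compose by case analysis on coproducts can be illustrated at the level of programs. The sketch below is an analogy only: Haskell cannot express the indexed proof algebras of (53), so it composes non-indexed Mendler evaluation algebras instead, and all names (`sumAlg`, `evalM1`, `evalM2`, `lit`, `add`, the operator `:+:`) are assumptions made for this example.

```haskell
{-# LANGUAGE RankNTypes, TypeOperators #-}

-- Mendler algebras and their impredicative fixpoint, cf. (37) and (38).
type AlgM f c = forall a. (a -> c) -> f a -> c

newtype FixM f = FixM { unFixM :: forall c. AlgM f c -> c }

foldMendler :: AlgM f c -> FixM f -> c
foldMendler alg x = unFixM x alg

inM :: f (FixM f) -> FixM f
inM x = FixM (\alg -> alg (foldMendler alg) x)

-- Coproduct of signature functors, cf. equation (6).
data (f :+: g) a = Inl (f a) | Inr (g a)
infixr 6 :+:

-- Composition of two modules: plain case analysis on the coproduct.
-- The same recursive call rec is handed to whichever module applies.
sumAlg :: AlgM f c -> AlgM g c -> AlgM (f :+: g) c
sumAlg algF algG = \rec s -> case s of
  Inl x -> algF rec x
  Inr y -> algG rec y

-- The two term modules of equation (8).
newtype TrmG1 a = Lit Int
data    TrmG2 a = Add a a

-- One Mendler algebra per module, written independently.
evalM1 :: AlgM TrmG1 Int
evalM1 _ (Lit n) = n

evalM2 :: AlgM TrmG2 Int
evalM2 rec (Add a b) = rec a + rec b

type Trm = FixM (TrmG1 :+: TrmG2)

eval :: Trm -> Int
eval = foldMendler (sumAlg evalM1 evalM2)

-- Hypothetical helper constructors.
lit :: Int -> Trm
lit n = inM (Inl (Lit n))

add :: Trm -> Trm -> Trm
add a b = inM (Inr (Add a b))
```

For instance, `eval (add (lit 1) (add (lit 2) (lit 3)))` is 6. In the Coq setting the same shape applies with `Int` replaced by the predicate $Q$ and each module's algebra replaced by the step lemma for that module.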


4.4 Problematic aspects 

What could be the downsides of the Mendler-based approach? As already observed, relying on
impredicative encodings gives us for free only a weak semantics of inductive datatypes, i.e., a quasi-initial
one. However, initiality is needed virtually everywhere in our proofs, to ensure in-maps and out-maps
are inverses, i.e.

$$(A)\ \ \mathsf{out}^{M}\,F\,(\mathsf{in}^{M}\,F\,x) = x \qquad (B)\ \ \mathsf{in}^{M}\,F\,(\mathsf{out}^{M}\,F\,x) = x \tag{55}$$

and similarly for the indexed case. In order to get proper initial semantics, functor-specific proofs of
properties such as (43) for base category $\mathsf{S}$, or the corresponding one for $K \to \mathsf{P}$, need to be carried out.
This may be regarded as a general weakness of impredicative approaches including MTC/3MT [6, 7],
as remarked by Keuchel and Schrijvers [11]. Nonetheless, in discussing the well-formedness of Church
encodings [7], Delaware et al. argue that dealing with this issue is not too hard, as indeed MTC provides
automation for doing so.

A more specific problem is related to the iterative character of Mendler-style recursion, and
correspondingly, to the non-dependent character of Mendler-style induction. Mendler algebras make it
possible to factor induction into case analysis and folding, but this restricts induction, in the sense of what
is called Mendler iteration by Abel, Matthes and Uustalu [1]: the argument of the induction hypothesis
cannot be used anywhere else, effectively ruling out dependent induction. This means there are problems
that cannot be solved in their original form. As an example, MTC [7] proves the type soundness of a
language with a dynamic semantics that is recursively defined as a total evaluation function. This problem
can be reformulated with respect to our concrete example in Section 2, using our definition of eval.

$$\Gamma,\ e : \mathsf{Trm},\ t : \mathsf{Typ} \vdash k : \mathsf{TypOf}\,(e, t) \to \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ (\mathsf{eval}\ e), t) \tag{56}$$


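Using the MTC approach, (56) can be proved by dependent induction on the structure of term $e$. Given
$\mathsf{dt\_def}\ \mathsf{Typ} = \mathsf{N}$ and assuming for simplicity $\mathsf{TypOf}$ is a conventional inductive predicate

$$\begin{aligned}
&\mathsf{dt\_def}\ \mathsf{TypOf} : \mathsf{Trm} * \mathsf{Typ} \to \mathsf{P} =\\
&\quad \mathsf{tof1} : \forall v : \mathsf{Val}.\ \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ v, \mathsf{N})\\
&\quad \mathsf{tof2} : \forall e_1\,e_2 : \mathsf{Trm}.\ \mathsf{TypOf}\,(e_1, \mathsf{N}) \wedge \mathsf{TypOf}\,(e_2, \mathsf{N}) \to \mathsf{TypOf}\,(\mathsf{add}\,(e_1, e_2), \mathsf{N})
\end{aligned} \tag{57}$$

the proof is ultimately based on a proof algebra of type
$\mathsf{Alg}^{C}\ \mathsf{TrmG}\ (\Sigma e.\ \forall t : \mathsf{Typ}.\ \mathsf{TypOf}\,(e, t) \to \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ (\mathsf{eval}\ e), t))$,
although as already noticed, folding this algebra only gives us the backbone
of the whole proof.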

This is not possible using our Mendler-style approach, as we cannot deal with the dependency of the
goal on the inductive argument $e$. What we can do instead, is to rely on the relational formulation of
evaluation given by $\mathsf{Eval}$ (50), which can be shown to satisfy (30), and prove

$$\Gamma,\ e : \mathsf{Trm},\ v : \mathsf{Val},\ t : \mathsf{Typ},\ h : \mathsf{Eval}\,(e, v) \vdash l : \mathsf{TypOf}\,(e, t) \to \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ v, t) \tag{58}$$

reasoning by induction on the structure of $\mathsf{Eval}$. This reformulation of the goal essentially matches (31).
In this case, a proof can be obtained by simply folding an indexed Mendler algebra of type
$\mathsf{Alg}^{MI}\ (\mathsf{Trm} * \mathsf{Val})\ \mathsf{Evalc}\ (\lambda (e, v).\ \forall t : \mathsf{Typ}.\ \mathsf{TypOf}\,(e, t) \to \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ v, t))$,
which provides our instance of (53).

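An alternative way to obtain a relational equivalent of (56) is to lift the modular datatype $\mathsf{Trm}$ to a
modular predicate $\mathsf{IsTrm} : (\mathsf{TrmG}\ \mathsf{Trm}) \to \mathsf{P}$, with $\mathsf{IsTrm} =_{df} \mathsf{Fix}^{MI}\ (\mathsf{TrmG}\ \mathsf{Trm})\ \mathsf{IsTrmc}$, where

$$\begin{aligned}
\mathsf{dt\_def}\ \mathsf{IsTrmc}\,A =\ &\mathsf{isLit} : \forall x : \mathsf{Int}.\ \mathsf{IsTrmc}\,A\,(\mathsf{lit}\,x)\\
\mid\ &\mathsf{isAdd} : \forall e_1\,e_2 : \mathsf{Trm}.\ A\,e_1 \wedge A\,e_2 \to \mathsf{IsTrmc}\,A\,(\mathsf{add}\,(e_1, e_2))
\end{aligned}$$

and then prove

$$\Gamma,\ e : \mathsf{Trm},\ w : \mathsf{IsTrm}\,e,\ t : \mathsf{Typ} \vdash k : \mathsf{TypOf}\,(e, t) \to \mathsf{TypOf}\,(\mathsf{lit} \circ \mathsf{vv}\ (\mathsf{eval}\ e), t) \tag{60}$$

reasoning by Mendler induction on $w$. Notice that eval in the MTC example [7] is actually defined as
the fold of a Mendler algebra, rather than a conventional one, in order to allow for control over the
evaluation order - this is related to the form of their semantics though, and completely unrelated to our
use of Mendler-style induction.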


5 Case study


The use of relational formulations appears particularly natural in specifications based on small-step rules
in the style of SOS, originally introduced by Plotkin [14]. Yet in order to formulate each relation
modularly, we need to build encodings based on functors that reflect the structure of those relations. This
inevitably makes things more complex, especially when we have to deal with mutually inductive
definitions. In order to test the applicability of Mendler proof algebras to the formalisation of a semantic
framework, we have formalised a language $\mathcal{L}$ with a comparatively rich syntactic structure, including
types ($\mathsf{Typ}$), patterns ($\mathsf{Pat}$), declarations ($\mathsf{Dec}$) and expressions ($\mathsf{Exp}$), as well as value environments
($\mathsf{Env}^{E}$) and typing environments ($\mathsf{Env}^{T}$). We rely on SOS to give a partial specification of the language:
partial, insofar as we do not specify any behaviour in case of pattern matching failure - therefore, we
cannot prove type soundness, which in fact does not hold. However, we can still prove type preservation
- and this suffices for us, as an example of the structural complexity we are aiming at.

The full language specification is available with the Coq formalisation in the companion code at
[17]. Here we outline the specification using conventional datatypes. The Coq formalisation is entirely
based on modular datatypes, although for simplicity we rely on monolithic functors (we have not yet
implemented the smart constructor mechanism that facilitates the use of coproducts).


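$$\mathsf{Env}\,A =_{df} \mathsf{Id} \to \mathsf{option}\,A$$

$$\begin{aligned}
\mathsf{dt\_def}\ \mathsf{Typ} &= \mathsf{ty}(\mathsf{Id}^{T}) \mid \mathsf{Typ} \Rightarrow \mathsf{Typ} \mid \mathsf{type\_env}(\mathsf{Env}^{T})\\
\mathsf{dt\_def}\ \mathsf{Pat} &= \mathsf{vrP}(\mathsf{Id}, \mathsf{Typ}) \mid \mathsf{cnP}(\mathsf{Id}, \mathsf{Typ}) \mid \mathsf{applyP}(\mathsf{Pat}, \mathsf{Pat})\\
\mathsf{dt\_def}\ \mathsf{Dec} &= \mathsf{env}(\mathsf{Env}^{E}) \mid \mathsf{match}(\mathsf{Pat}, \mathsf{Exp}) \mid \mathsf{join}(\mathsf{Dec}, \mathsf{Dec})\\
\mathsf{dt\_def}\ \mathsf{Exp} &= \mathsf{vr}(\mathsf{Id}) \mid \mathsf{cn}(\mathsf{Id}, \mathsf{Typ}) \mid \mathsf{closure}(\mathsf{Env}^{E}, \mathsf{Pat}, \mathsf{Exp})\\
&\quad \mid \mathsf{apply}(\mathsf{Exp}, \mathsf{Exp}) \mid \mathsf{scope}(\mathsf{Dec}, \mathsf{Exp})
\end{aligned} \tag{61}$$

$$\mathsf{Env}^{T} =_{df} \mathsf{Env}\ \mathsf{Typ} \qquad \mathsf{Env}^{E} =_{df} \mathsf{Env}\ \mathsf{Exp} \tag{62}$$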


The language $\mathcal{L}$ is based on simply typed lambda calculus with pattern matching and first class
environments. We use two sets of identifiers - $\mathsf{Id}^{T}$ for type variables and $\mathsf{Id}$ for object variables and constants.
Constants and pattern variables are annotated with types. $\Rightarrow$ is the usual function type constructor. We
use closures instead of lambda abstractions to ensure values are closed terms and avoid dealing with
substitution. Abstraction is defined over patterns (rather than simply over variables). Matching patterns with
expressions gives declarations, which may evaluate to environments. Declarations can be joined together
and used in scope expressions. Values can be specified as follows.

$$\begin{aligned}
&\text{Data values}: & h &\in \mathsf{cn}(x, t) \mid \mathsf{apply}(h, v)\\
&\text{Values}: & v &\in \mathsf{closure}(\rho, p, e) \mid h
\end{aligned} \tag{63}$$


The typing relations have the following signatures. Notice that patterns and values can be typed in a 
context-free way, unlike expressions and declarations. 


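\[
\begin{array}{lll}
\text{Patterns}: & \mathsf{TypOPat} : & \mathsf{Pat} * \mathsf{Typ} \to \mathsf{P} \\
\text{Environments}: & \mathsf{TypOEnv} : & \mathsf{Env}^{E} * \mathsf{Env}^{T} \to \mathsf{P} \\
\text{Declarations}: & \mathsf{TypODec} : & \mathsf{Env}^{T} * \mathsf{Dec} * \mathsf{Typ} \to \mathsf{P} \\
\text{Expressions}: & \mathsf{TypOExp} : & \mathsf{Env}^{T} * \mathsf{Exp} * \mathsf{Typ} \to \mathsf{P}
\end{array}
\tag{64}
\]

The transition relations have the following signatures. 

\[
\begin{array}{lll}
\text{Declarations}: & \mathsf{DecStep} : & \mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec} \to \mathsf{P} \\
\text{Expressions}: & \mathsf{ExpStep} : & \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp} \to \mathsf{P}
\end{array}
\tag{65}
\]

Expressions and declarations may depend on each other, and therefore can only have a mutually inductive 
definition. Analogously, the definitions of the typing relations and of the transition relations for these 
two syntactic categories involve mutual induction. Therefore we need to introduce functors to reason 
about mutually inductively defined sets, as well as mutually inductively defined relations. 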


5.1 Mutually inductive sets 

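Two mutually recursive datatypes in the base category $S$ can be represented in terms of bi-functors 
$F_1, F_2 : S * S \to S$, where bi-functoriality is expressed as existence of a map $\mathsf{fmap}^{D}$ which satisfies the 
appropriate form of the usual preservation properties. 

\[
\mathsf{fmap}^{D} : \forall \{A_1\ A_2\ B_1\ B_2 : S\}\ (f_1 : A_1 \to B_1)\ (f_2 : A_2 \to B_2).\ F\,(A_1,A_2) \to F\,(B_1,B_2)
\tag{66}
\]

\[
\begin{array}{rcl}
\mathsf{fmap}^{D}\ g_1\ g_2 \cdot (\mathsf{fmap}^{D}\ f_1\ f_2) & = & \mathsf{fmap}^{D}\ (g_1 \cdot f_1)\ (g_2 \cdot f_2) \\
\mathsf{fmap}^{D}\ \mathsf{id}_{A}\ \mathsf{id}_{B} & = & \mathsf{id}_{F (A,B)}
\end{array}
\]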


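The definitions of Mendler bi-algebra, fixpoint and fold operators can be given using pairs. 

\[
\begin{array}{l}
\mathsf{Alg}^{D}\ (F_1,F_2)\ (C_1,C_2) =_{\mathit{df}} \\
\qquad (\forall A_1\ A_2.\ (A_1 \to C_1) \to (A_2 \to C_2) \to F_1\,(A_1,A_2) \to C_1, \\
\qquad\ \forall A_1\ A_2.\ (A_1 \to C_1) \to (A_2 \to C_2) \to F_2\,(A_1,A_2) \to C_2)
\end{array}
\]

\[
\begin{array}{l}
\mathsf{Fix}^{D}\ (F_1,F_2) =_{\mathit{df}} \\
\qquad (\forall A_1\ A_2.\ \mathsf{Alg}^{D}\ (F_1,F_2)\ (A_1,A_2) \to A_1, \\
\qquad\ \forall A_1\ A_2.\ \mathsf{Alg}^{D}\ (F_1,F_2)\ (A_1,A_2) \to A_2)
\end{array}
\tag{69}
\]

\[
\begin{array}{l}
\mathsf{fold}^{D}_{1}\ (F_1,F_2)\ (C_1,C_2)\ (f : \mathsf{Alg}^{D}\ (F_1,F_2)\ (C_1,C_2)) : \\
\qquad \mathsf{fst}\ (\mathsf{Fix}^{D}\ (F_1,F_2)) \to C_1 \ =_{\mathit{df}}\ \lambda e.\ e\ C_1\ C_2\ f
\end{array}
\tag{70}
\]

\[
\begin{array}{l}
\mathsf{fold}^{D}_{2}\ (F_1,F_2)\ (C_1,C_2)\ (f : \mathsf{Alg}^{D}\ (F_1,F_2)\ (C_1,C_2)) : \\
\qquad \mathsf{snd}\ (\mathsf{Fix}^{D}\ (F_1,F_2)) \to C_2 \ =_{\mathit{df}}\ \lambda e.\ e\ C_1\ C_2\ f
\end{array}
\tag{71}
\]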
All the syntactic categories of $\mathcal{L}$ can then be represented as MDTs, using bi-functors for the mutually 
defined Dec and Exp. 


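\[
\begin{array}{lcl}
\mathsf{Typ}_G\ T & =_{\mathit{dt\_def}} & \mathsf{ty}(\mathsf{Id}^{T}) \mid T \Rightarrow T \mid \mathsf{type\_env}(\mathsf{Env}\ T) \\
\mathsf{Pat}_G\ P & =_{\mathit{dt\_def}} & \mathsf{vrP}(\mathsf{Id},\mathsf{Typ}) \mid \mathsf{cnP}(\mathsf{Id},\mathsf{Typ}) \mid \mathsf{applyP}(P,P) \\
\mathsf{Dec}_G\ D\ E & =_{\mathit{dt\_def}} & \mathsf{env}(\mathsf{Env}\ E) \mid \mathsf{match}(\mathsf{Pat},E) \mid \mathsf{join}(D,D) \\
\mathsf{Exp}_G\ D\ E & =_{\mathit{dt\_def}} & \mathsf{vr}(\mathsf{Id}) \mid \mathsf{cn}(\mathsf{Id},\mathsf{Typ}) \mid \mathsf{closure}(\mathsf{Env}\ E,\mathsf{Pat},E) \mid \mathsf{apply}(E,E) \mid \mathsf{scope}(D,E) \\[4pt]
\mathsf{Typ} & =_{\mathit{df}} & \mathsf{Fix}^{M}\ \mathsf{Typ}_G \\
\mathsf{Pat} & =_{\mathit{df}} & \mathsf{Fix}^{M}\ \mathsf{Pat}_G \\
\mathsf{Dec} & =_{\mathit{df}} & \mathsf{fst}\ (\mathsf{Fix}^{D}\ (\mathsf{Dec}_G,\mathsf{Exp}_G)) \\
\mathsf{Exp} & =_{\mathit{df}} & \mathsf{snd}\ (\mathsf{Fix}^{D}\ (\mathsf{Dec}_G,\mathsf{Exp}_G))
\end{array}
\tag{72}
\]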
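
As an illustration only, the pair-based definitions of Mendler bi-algebra, fixpoint and fold given in this subsection can be transcribed into Lean 4 roughly as follows. This is a minimal sketch under stated assumptions, not the Coq development of the companion code: the names `AlgD`, `FixD₁`, `FixD₂`, `foldD₁` and `foldD₂` merely mirror the text, the pairs are split into a two-field structure and two separate fixpoint types for readability, and `Type` stands in for the base category S. Since Lean's `Type` is predicative, the fixpoints land one universe up; the in-maps, which need an impredicative universe, are therefore deliberately left out.

```lean
/-- Mendler bi-algebra: the two components of the pair Alg^D (F₁,F₂) (C₁,C₂). -/
structure AlgD (F₁ F₂ : Type → Type → Type) (C₁ C₂ : Type) : Type 1 where
  alg₁ : (A₁ A₂ : Type) → (A₁ → C₁) → (A₂ → C₂) → F₁ A₁ A₂ → C₁
  alg₂ : (A₁ A₂ : Type) → (A₁ → C₁) → (A₂ → C₂) → F₂ A₁ A₂ → C₂

/-- First component of Fix^D (F₁,F₂), as in (69). -/
def FixD₁ (F₁ F₂ : Type → Type → Type) : Type 1 :=
  (A₁ A₂ : Type) → AlgD F₁ F₂ A₁ A₂ → A₁

/-- Second component of Fix^D (F₁,F₂), as in (69). -/
def FixD₂ (F₁ F₂ : Type → Type → Type) : Type 1 :=
  (A₁ A₂ : Type) → AlgD F₁ F₂ A₁ A₂ → A₂

/-- First fold, as in (70): apply the encoded term to the carriers and the algebra. -/
def foldD₁ {F₁ F₂ : Type → Type → Type} {C₁ C₂ : Type}
    (f : AlgD F₁ F₂ C₁ C₂) : FixD₁ F₁ F₂ → C₁ :=
  fun e => e C₁ C₂ f

/-- Second fold, as in (71). -/
def foldD₂ {F₁ F₂ : Type → Type → Type} {C₁ C₂ : Type}
    (f : AlgD F₁ F₂ C₁ C₂) : FixD₂ F₁ F₂ → C₂ :=
  fun e => e C₁ C₂ f
```

Note that neither fold mentions `fmap`: in the Mendler style the algebra receives the recursive calls as the two function arguments `A₁ → C₁` and `A₂ → C₂`, so functoriality is not needed to define the fold itself.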


5.2 Mutually inductive relations 

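Given types $K_1, K_2$, two mutually recursive relations depending on such types in base categories 
$K_1 \to \mathsf{P}$, $K_2 \to \mathsf{P}$ can be represented by indexed bi-functors $R_1, R_2$, with 

\[
\begin{array}{l}
R_1 : (K_1 \to \mathsf{P}) * (K_2 \to \mathsf{P}) \to (K_1 \to \mathsf{P}) \\
R_2 : (K_1 \to \mathsf{P}) * (K_2 \to \mathsf{P}) \to (K_2 \to \mathsf{P})
\end{array}
\tag{73}
\]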




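characterised by maps 

\[
\begin{array}{l}
\mathsf{fmap}^{H}_{1}\ (K_1,K_2)\ R_1 : \forall \{A_1\ B_1 : K_1 \to \mathsf{P}\}\ \{A_2\ B_2 : K_2 \to \mathsf{P}\}. \\
\qquad (\forall w : K_1.\ A_1\ w \to B_1\ w) \to (\forall w : K_2.\ A_2\ w \to B_2\ w) \to \\
\qquad \forall w : K_1.\ R_1\,(A_1,A_2)\ w \to R_1\,(B_1,B_2)\ w
\end{array}
\tag{74}
\]

\[
\begin{array}{l}
\mathsf{fmap}^{H}_{2}\ (K_1,K_2)\ R_2 : \forall \{A_1\ B_1 : K_1 \to \mathsf{P}\}\ \{A_2\ B_2 : K_2 \to \mathsf{P}\}. \\
\qquad (\forall w : K_1.\ A_1\ w \to B_1\ w) \to (\forall w : K_2.\ A_2\ w \to B_2\ w) \to \\
\qquad \forall w : K_2.\ R_2\,(A_1,A_2)\ w \to R_2\,(B_1,B_2)\ w
\end{array}
\tag{75}
\]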

Given carriers $D_1 : K_1 \to \mathsf{P}$, $D_2 : K_2 \to \mathsf{P}$, we can now define indexed Mendler bi-algebras and the 
associated notions (see ifTTll for more details). 

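\[
\begin{array}{l}
\mathsf{Alg}^{H}\ (K_1,K_2)\ (R_1,R_2)\ (D_1,D_2) =_{\mathit{df}} \\
\qquad (\forall A_1\ A_2.\ (\forall w : K_1.\ A_1\ w \to D_1\ w) \to (\forall w : K_2.\ A_2\ w \to D_2\ w) \to \\
\qquad\qquad \forall w : K_1.\ R_1\,(A_1,A_2)\ w \to D_1\ w, \\
\qquad\ \forall A_1\ A_2.\ (\forall w : K_1.\ A_1\ w \to D_1\ w) \to (\forall w : K_2.\ A_2\ w \to D_2\ w) \to \\
\qquad\qquad \forall w : K_2.\ R_2\,(A_1,A_2)\ w \to D_2\ w)
\end{array}
\tag{76}
\]

\[
\begin{array}{l}
\mathsf{Fix}^{H}\ (K_1,K_2)\ (R_1,R_2) =_{\mathit{df}} \\
\qquad (\lambda w : K_1.\ \forall A_1\ A_2.\ \mathsf{Alg}^{H}\ (K_1,K_2)\ (R_1,R_2)\ (A_1,A_2) \to A_1\ w, \\
\qquad\ \lambda w : K_2.\ \forall A_1\ A_2.\ \mathsf{Alg}^{H}\ (K_1,K_2)\ (R_1,R_2)\ (A_1,A_2) \to A_2\ w)
\end{array}
\tag{77}
\]

\[
\begin{array}{l}
\mathsf{fold}^{H}_{1}\ (K_1,K_2)\ (R_1,R_2)\ (D_1,D_2)\ (f : \mathsf{Alg}^{H}\ (K_1,K_2)\ (R_1,R_2)\ (D_1,D_2))\ (w : K_1) : \\
\qquad \mathsf{fst}\ (\mathsf{Fix}^{H}\ (K_1,K_2)\ (R_1,R_2))\ w \to D_1\ w \ =_{\mathit{df}}\ \lambda w\ e.\ e\ D_1\ D_2\ f
\end{array}
\tag{78}
\]

\[
\begin{array}{l}
\mathsf{fold}^{H}_{2}\ (K_1,K_2)\ (R_1,R_2)\ (D_1,D_2)\ (f : \mathsf{Alg}^{H}\ (K_1,K_2)\ (R_1,R_2)\ (D_1,D_2))\ (w : K_2) : \\
\qquad \mathsf{snd}\ (\mathsf{Fix}^{H}\ (K_1,K_2)\ (R_1,R_2))\ w \to D_2\ w \ =_{\mathit{df}}\ \lambda w\ e.\ e\ D_1\ D_2\ f
\end{array}
\tag{79}
\]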

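While the typing relation for patterns TypOPat can be represented modularly using an indexed functor 
and $\mathsf{Fix}^{I}$, the corresponding relations for declarations and expressions, i.e. TypODec and TypOExp 
respectively, are mutually defined and therefore need to be represented as indexed bi-functors closed 
by $\mathsf{Fix}^{H}$. Such is also the case for DecStep and ExpStep, which can be defined as follows, given the 
corresponding indexed bi-functors 
$\mathsf{DecStep}_G : (\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec} \to \mathsf{P},\ \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp} \to \mathsf{P}) \to \mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec} \to \mathsf{P}$, and 
$\mathsf{ExpStep}_G : (\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec} \to \mathsf{P},\ \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp} \to \mathsf{P}) \to \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp} \to \mathsf{P}$. 

\[
\mathsf{DecStep} =_{\mathit{df}} \mathsf{fst}\ (\mathsf{Fix}^{H}\ (\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec},\ \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp})\ (\mathsf{DecStep}_G, \mathsf{ExpStep}_G))
\tag{80}
\]

\[
\mathsf{ExpStep} =_{\mathit{df}} \mathsf{snd}\ (\mathsf{Fix}^{H}\ (\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec},\ \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp})\ (\mathsf{DecStep}_G, \mathsf{ExpStep}_G))
\tag{81}
\]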
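
The indexed definitions of this subsection can be sketched in Lean 4 along the following lines. This is a hedged illustration rather than the formalisation of the companion code: `Pred`, `IBiFun`, `AlgH`, `FixH₁`, `FixH₂`, `foldH₁`, `foldH₂` and `inH₁` are names chosen here, Lean's `Prop` is assumed to play the role of P, the pairs are split into components, and the bi-functors are curried. The in-map `inH₁` is one possible definition, included only to show that impredicativity of `Prop` lets the fixpoint be fed back into the algebra.

```lean
/-- Predicates over an index type `K` (the category K → P of the text). -/
abbrev Pred (K : Type) : Type := K → Prop

/-- Curried indexed bi-functors: two predicates are mapped to a predicate over `K`. -/
abbrev IBiFun (K₁ K₂ K : Type) : Type := Pred K₁ → Pred K₂ → Pred K

/-- Indexed Mendler bi-algebra with carriers `D₁`, `D₂`, following (76). -/
structure AlgH {K₁ K₂ : Type} (R₁ : IBiFun K₁ K₂ K₁) (R₂ : IBiFun K₁ K₂ K₂)
    (D₁ : Pred K₁) (D₂ : Pred K₂) : Prop where
  alg₁ : ∀ (A₁ : Pred K₁) (A₂ : Pred K₂),
    (∀ w : K₁, A₁ w → D₁ w) → (∀ w : K₂, A₂ w → D₂ w) →
    ∀ w : K₁, R₁ A₁ A₂ w → D₁ w
  alg₂ : ∀ (A₁ : Pred K₁) (A₂ : Pred K₂),
    (∀ w : K₁, A₁ w → D₁ w) → (∀ w : K₂, A₂ w → D₂ w) →
    ∀ w : K₂, R₂ A₁ A₂ w → D₂ w

/-- The two components of the indexed fixpoint, following (77). -/
def FixH₁ {K₁ K₂ : Type} (R₁ : IBiFun K₁ K₂ K₁) (R₂ : IBiFun K₁ K₂ K₂) : Pred K₁ :=
  fun w => ∀ (A₁ : Pred K₁) (A₂ : Pred K₂), AlgH R₁ R₂ A₁ A₂ → A₁ w

def FixH₂ {K₁ K₂ : Type} (R₁ : IBiFun K₁ K₂ K₁) (R₂ : IBiFun K₁ K₂ K₂) : Pred K₂ :=
  fun w => ∀ (A₁ : Pred K₁) (A₂ : Pred K₂), AlgH R₁ R₂ A₁ A₂ → A₂ w

/-- The folds of (78) and (79): instantiate the encoded proof at the carriers. -/
theorem foldH₁ {K₁ K₂ : Type} {R₁ : IBiFun K₁ K₂ K₁} {R₂ : IBiFun K₁ K₂ K₂}
    {D₁ : Pred K₁} {D₂ : Pred K₂} (f : AlgH R₁ R₂ D₁ D₂) (w : K₁) :
    FixH₁ R₁ R₂ w → D₁ w :=
  fun e => e D₁ D₂ f

theorem foldH₂ {K₁ K₂ : Type} {R₁ : IBiFun K₁ K₂ K₁} {R₂ : IBiFun K₁ K₂ K₂}
    {D₁ : Pred K₁} {D₂ : Pred K₂} (f : AlgH R₁ R₂ D₁ D₂) (w : K₂) :
    FixH₂ R₁ R₂ w → D₂ w :=
  fun e => e D₁ D₂ f

/-- A possible in-map for the first component (an assumption of this sketch). -/
theorem inH₁ {K₁ K₂ : Type} {R₁ : IBiFun K₁ K₂ K₁} {R₂ : IBiFun K₁ K₂ K₂} (w : K₁) :
    R₁ (FixH₁ R₁ R₂) (FixH₂ R₁ R₂) w → FixH₁ R₁ R₂ w :=
  fun x (A₁ : Pred K₁) (A₂ : Pred K₂) (alg : AlgH R₁ R₂ A₁ A₂) =>
    alg.alg₁ (FixH₁ R₁ R₂) (FixH₂ R₁ R₂)
      (fun _ e => e A₁ A₂ alg) (fun _ e => e A₁ A₂ alg) w x
```

In this reading, DecStep and ExpStep correspond to `FixH₁` and `FixH₂` applied to the two step bi-functors, with the index types instantiated to the triples used in (80) and (81).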

5.3 Type preservation 

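Type preservation in $\mathcal{L}$ can be expressed as follows 

\[
\begin{array}{l}
\Gamma,\ \rho : \mathsf{Env}^{E} \vdash (\forall (d_1\ d_2 : \mathsf{Dec}).\ \mathsf{DecStep}\ (\rho,d_1,d_2) \to \mathsf{DecTSafe}\ (\rho,d_1,d_2)) \\
\qquad \wedge\ (\forall (e_1\ e_2 : \mathsf{Exp}).\ \mathsf{ExpStep}\ (\rho,e_1,e_2) \to \mathsf{ExpTSafe}\ (\rho,e_1,e_2))
\end{array}
\tag{82}
\]

where 

\[
\begin{array}{l}
\mathsf{DecTSafe}\ (\rho,d_1,d_2) =_{\mathit{df}} \forall (t : \mathsf{Typ})\ (\gamma : \mathsf{Env}^{T}). \\
\qquad \mathsf{TypOEnv}\ (\rho,\gamma) \to \mathsf{TypODec}\ (\gamma,d_1,t) \to \mathsf{TypODec}\ (\gamma,d_2,t) \\
\mathsf{ExpTSafe}\ (\rho,e_1,e_2) =_{\mathit{df}} \forall (t : \mathsf{Typ})\ (\gamma : \mathsf{Env}^{T}). \\
\qquad \mathsf{TypOEnv}\ (\rho,\gamma) \to \mathsf{TypOExp}\ (\gamma,e_1,t) \to \mathsf{TypOExp}\ (\gamma,e_2,t)
\end{array}
\tag{83}
\]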

The context $\Gamma$ includes premises of shape 

\[
(\mathsf{IN}\ x = \mathsf{IN}\ y) \to (x = y)
\tag{84}
\]

where IN is the in-map for one of the datatypes - such premises can be discharged when the 
corresponding initiality conditions (43) are proven. It also includes premises of shape 

\[
\forall x : D_G.\ \mathsf{IsD}_G\ x
\tag{85}
\]

where $D_G$ is the unfolding of a modular datatype $D$, and $\mathsf{IsD}_G$ is the unfolding of a modular predicate IsD 
that represents the relational lifting of $D$, in the sense of our example (59). Such premises are needed, 
as the proof involves sublemmas that are proved by induction on the syntactic categories - and so, for 
instance, $\mathsf{Typ}_G\ \mathsf{Typ}$ has to be lifted to $\mathsf{IsTyp}_G : (\mathsf{Typ}_G\ \mathsf{Typ} \to \mathsf{P}) \to \mathsf{Typ}_G\ \mathsf{Typ} \to \mathsf{P}$. 

Crucially, the pair of DecTSafe and ExpTSafe can be a carrier for the indexed bi-functor determined 
by DecStep and ExpStep. In order to prove type preservation by mutual induction on the structure 
of DecStep and ExpStep, we define an indexed Mendler bi-algebra that has (DecTSafe, ExpTSafe) as 
indexed carrier, where the index types are $\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec}$ and $\mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp}$ 

\[
\begin{array}{l}
\mathsf{TPAlg} =_{\mathit{df}} \mathsf{Alg}^{H}\ (\mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec},\ \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp}) \\
\qquad (\mathsf{DecStep}_G, \mathsf{ExpStep}_G)\ (\mathsf{DecTSafe}, \mathsf{ExpTSafe})
\end{array}
\]

After finding proofs $f_1 : \mathsf{fst}\ \mathsf{TPAlg}$ and $f_2 : \mathsf{snd}\ \mathsf{TPAlg}$, we can construct a proof of (82) by applying to 
them $\mathsf{fold}^{H}_{1}$ and $\mathsf{fold}^{H}_{2}$, respectively (see iflTl for details). 
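
For orientation, unfolding definition (76) at this instance gives roughly the following two proof obligations. This is our reading of the definitions above rather than a formula taken from the companion code; the abbreviations $W_D$ and $W_E$ for the two index types are introduced here only to keep the display short.

\[
\begin{array}{l}
W_D = \mathsf{Env}^{E} * \mathsf{Dec} * \mathsf{Dec} \qquad W_E = \mathsf{Env}^{E} * \mathsf{Exp} * \mathsf{Exp} \\[4pt]
\mathsf{fst}\ \mathsf{TPAlg} \;=\; \forall A_1\ A_2.\ (\forall w : W_D.\ A_1\ w \to \mathsf{DecTSafe}\ w) \to (\forall w : W_E.\ A_2\ w \to \mathsf{ExpTSafe}\ w) \to \\
\qquad\qquad \forall w : W_D.\ \mathsf{DecStep}_G\,(A_1,A_2)\ w \to \mathsf{DecTSafe}\ w \\[4pt]
\mathsf{snd}\ \mathsf{TPAlg} \;=\; \forall A_1\ A_2.\ (\forall w : W_D.\ A_1\ w \to \mathsf{DecTSafe}\ w) \to (\forall w : W_E.\ A_2\ w \to \mathsf{ExpTSafe}\ w) \to \\
\qquad\qquad \forall w : W_E.\ \mathsf{ExpStep}_G\,(A_1,A_2)\ w \to \mathsf{ExpTSafe}\ w
\end{array}
\]

Here $A_1$ and $A_2$ stand for the recursive occurrences of the two step relations inside a rule, and the two implications into DecTSafe and ExpTSafe play the role of the mutual induction hypotheses. Each obligation can then be proved by case analysis on the rules collected in the corresponding bi-functor, without ever unfolding the fixpoint.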

6 Conclusion 

Motivated by the importance of modularity in program development, semantics and verification, we have 
discussed the use of MDTs, their semantic foundations and their impredicative encoding along the lines 
of existing work d [HI [HI . We have shown how impredicative MDT encodings based on Mendler 
algebras can be used to reason about inductively defined relations, in a way that is comparatively close to 
a more conventional style of reasoning based on closed datatypes, by providing a simpler notion of proof 
algebra, if less general, than the one proposed by Delaware et al. Q. Our approach can be regarded as a 
novel application of Mendler-style induction |[T^ [T1[T^. as well as a technique that could be integrated in 
existing frameworks based on the impredicative encoding, such as MTC/3MT dl^. Mendler’s original 
insight ifT^ was in the semantics of inductive datatypes - the case made here, is for using that insight as 
a modular proof technique. From the point of view of possible applications to semantics and verification 
in frameworks such as OTT (HI, the relational style that can be supported seems to fit in well with SOS 
and in particular with component-based approaches, such as the one proposed by Churchill, Mosses, 
Sculthorpe and Torrini [3]. Our plans for future work include integrating our technique in MTC/3MT, 
and comparing this approach with the container-based one proposed by Keuchel and Schrijvers (TTll . 